[pull] master from GaijinEntertainment:master#1017
Merged
STYLE024: ExprSafeAt (?[]) on table<> / array<> / pointer-to-(table|array|pointer) requires unsafe per ast_infer_type.cpp (errors unsafe_table_safe_index / unsafe_array_safe_index / unsafe_pointer_safe_index), but the visitor only marked ExprAt. Add a preVisitExprSafeAt mirroring the compiler's locality check so the wrap is not flagged as redundant.
STYLE025: unsafe(expr) sets alwaysSafe only on its immediate child (ds2_parser.ypp:2275, no descent). When the only unsafe leaf sits inside a let-ref binding to a non-local non-temporary RHS (e.g. var s & = *reinterpret<T?>(raw)), the let-ref binding itself requires unsafe at statement level (ast_infer_type.cpp:4989), and no single expression-form wrap can satisfy both the let-ref check and the buried leaf's own unsafe check. Detect this via a stack frame (count + has_non_local_let_ref) propagated alongside the existing leaf count, and stay silent when the flag is set.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PR #2746 (Phase 2+3+4) added unfused SimNode_ArrayAt_I64 / _U64 for int64-/uint64-indexed array access. The existing fusion engine in simulate_fusion_at_array.cpp hardcoded evalInt(context) and a uint32_t(...) narrowing in every IMPLEMENT_OP2_SET_NODE macro, so it could not fire on the new int64-indexed nodes - every arr[i64] access fell off the fast path.
This commit adds parallel _I64 and _U64 fusion families:
* SimNode_Op2ArrayAt_I64 / _U64 base structs alongside the existing SimNode_Op2ArrayAt.
* Three new sections each (ArrayAtR2V scalar / vector / ArrayAt PTR) for I64 and U64. Each section redefines the IMPLEMENT_OP2_SET_NODE family to:
  - read the right operand via r.subexpr->evalInt64(context) [or evalUInt64] instead of evalInt
  - read the right operand from a register as int64_t [or uint64_t] instead of uint32_t
  - bounds-check the int64 path as idx<0 || uint64_t(idx) >= size
  - bounds-check the uint64 path as idx >= size
  - keep uint64_t(idx) * uint64_t(stride) + offset arithmetic
* createFusionEngine_at_array() now registers all three families (existing int32, new int64, new uint64).
Table fusion needed no code change: SimNode_TableIndex<KeyType> is template-parameterized and IMPLEMENT_SETOP_NUMERIC(TableIndex) already registers int64_t and uint64_t key types. Adds test_fusion_table_i64.das to lock in correctness for int64/uint64-keyed tables.
Fusion was confirmed firing via options log_nodes:
(ArrayAt_I64LocConst #32 {3,0,0,0} 0x4 0x0)
(ArrayAtR2V_I64LocConst_TT<int> #32 {5,0,0,0} 0x4 0x0)
Slice C (char* At fusion) was investigated and confirmed-empty: the typer at src/ast/ast_infer_type.cpp:3088 rejects non-isIndex indices for the fixed-array path (SimNode_At), so int64 indices never reach that SimNode. SimNode_PtrAt for pointer-indexing has no fusion engine at all (neither int32 nor int64). Out of scope.
Tests + benches:
* tests/long_array_table/test_fusion_arr_i64.das - 7 tests covering const/local/argument compute modes and float value type
* tests/long_array_table/test_fusion_table_i64.das - 5 tests exercising int64/uint64 keys + overwrite + via-argument
* benchmarks/fusion/bench_arr_at_i64.das, benchmarks/fusion/bench_table_index_i64.das - side-by-side int-vs-int64 indexing throughput, baseline for downstream phases
8952/8952 interpreter tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
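The two bounds checks and the 64-bit offset arithmetic listed in the commit above can be sketched in isolation. This is a minimal illustration, not the SimNode code: the helper names (in_bounds_i64, in_bounds_u64, element_offset, narrowed_index) are hypothetical, and only the comparison shapes are taken from the commit text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; the names are illustrative, not taken from the repository.

// Signed 64-bit index: reject negatives first, then compare as unsigned.
inline bool in_bounds_i64(int64_t idx, uint64_t size) {
    return !(idx < 0 || uint64_t(idx) >= size);
}

// Unsigned 64-bit index: a single comparison is enough.
inline bool in_bounds_u64(uint64_t idx, uint64_t size) {
    return !(idx >= size);
}

// Byte offset computed entirely in 64 bits.
inline uint64_t element_offset(uint64_t idx, uint64_t stride, uint64_t offset) {
    return idx * stride + offset;
}

// The kind of narrowing the int32 path applied: large indices wrap.
inline uint32_t narrowed_index(int64_t idx) {
    return uint32_t(idx);
}

inline void check_index_paths() {
    assert(in_bounds_i64(3, 10));
    assert(!in_bounds_i64(-1, 10));
    assert(!in_bounds_i64(10, 10));
    assert(in_bounds_u64(9, 10));
    // 2^32 + 1 narrows to 1, which a 32-bit bounds check against size 10 would accept.
    assert(narrowed_index((int64_t(1) << 32) + 1) == 1u);
    // Offsets past 4 GB survive because every operand is 64-bit.
    assert(element_offset(uint64_t(1) << 32, 4, 8) == (uint64_t(1) << 34) + 8);
}
```

Calling check_index_paths() from a test main exercises both index types; the narrowed_index case shows why a uint32_t(...) cast cannot be reused for int64 indices.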
Original bench mixed for-loop (int) with while-loop (int64/uint64),
so the int64 numbers conflated fusion cost with while-loop harness
overhead. Rewrite all three index types to the same shape:
for (i in range(N)) // int
for (i in range64(N64)) // int64
for (i in urange64(N64)) // uint64
Also adds the missing uint64-write subtest in bench_arr_at_i64.das
so the array bench has full read+write cross-product across all three
index types.
New numbers (per-op ns, interpreter):
array read: int=3 int64=4 uint64=3
array write: int=5 int64=14 uint64=9
table read: int=9 int64=10 uint64=10
Reads are within ~33% across all index types (uint64 read matches
int32 at parity). The int64-write gap (5 -> 14) is a real cost
discrepancy, not harness overhead — worth a follow-up look but
out of scope for the fusion-correctness PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous int64-write bench measured 14 ns/op vs int's 5 ns/op. SimNode dump showed the gap was entirely in the BODY, not the index fusion: `arr[i] = int(i) * 2` for an int64 i emits `MulAnyConst(Cast_to_int(GetLocalR2V(i)), 2)` — an explicit narrowing SimNode plus an unfused MulAnyConst (vs int's fused MulLocConst). The LHS `ArrayAt_I64LocLoc` was firing fine — the cast/mul on the RHS was the real cost. Two changes to isolate just the index/fusion cost: 1. Write a constant (`arr[i] = 1`) instead of `int(i) * 2`. No cast in the body, so the bench measures the index path only. 2. Wrap the inner loop in `for (_j in range(OUTER))` (OUTER = 10). Each `b |> run` body now does OUTER * N inner ops, amortizing per-call harness overhead. Apples-to-apples numbers (per-op ns, OUTER * N = 100000 ops/run): array read: int=3 int64=3 uint64=3 array write: int=5 int64=5 uint64=5 table read: int=9 int64=10 uint64=10 All three index types at parity for array access; table reads within 1 ns. Confirms the Phase 5 fusion variants land int64/uint64 indexing on the same fast path as int32. Note: separate from Phase 5, `Mul(Cast_to_int(int64Local), Const)` is not fused — int's `MulLocConst` doesn't match when the LHS is a cast result. That's an independent fusion opportunity not covered by this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…se5-fusion-i64 longarr phase 5: fusion variants for arr[i64] / arr[u64] indexing
cond ? T(a) : T(b) where both branches apply the same workhorse cast emits two ExprCall nodes that do identical work. Hoist to T(cond ? a : b) — one call instead of two, same evaluation semantics. Suggested in the PR #2753 review by @aleksisch: "string(a ? b : c) instead of a ? string(b) : string(c)? Can be added to linter too btw. It's not first time such code was written." The rule reuses PERF020's 15-name workhorse-cast set (int/int8/int16/int64/uint/uint8/uint16/uint64/float/double/string/ bitfield/bitfield8/bitfield16/bitfield64) and fires when: - Both ternary branches resolve to the same workhorse cast name. - Both calls share the same target Type. - The user argument on both branches has the same baseType — so the hoisted T(cond ? a : b) typechecks without an intermediate cast. Different-arg-baseType cases (cond ? string(intV) : string(int64V)) intentionally do NOT fire — the rewrite would need a manual widen and that is left to the author. The rule fires anywhere, including inside closure bodies, matching PERF020's stance: a redundant cast is redundant regardless of scope. Argument-count gate accepts any >=1 to handle string(int) (bound with explicit args({"value","hex","context","at"}) -> 4 daslang args), which the original single-arg gate would have missed. Drive-by: same-PR daslib sweep -- three perf_lint.das self-hits (call.func.fromGeneric != null ? string(.fromGeneric.name) : string(.name)) hoisted to the PERF021-suggested form. Zero residual PERF021 hits in daslib post-fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot review flagged on PR #2759: the rule's first-arg-only check mis-fires on `cond ? string(a, true) : string(b, false)`. The suggested rewrite `string(cond ? a : b)` silently drops one branch's `hex=true` — real semantic change. Reproduced locally. Fix: add `cast_call_tail_args_equal(le, re)` — compares arguments[1..] structurally between branches via `expr_equal_struct(.., require_pure= false)`. Skips ExprFakeContext / ExprFakeLineInfo (auto-injected for Context*/LineInfoArg* params; differ at every call site by design). Wired into `check_perf021_ternary_cast_hoist` as a final gate after the existing first-arg baseType check. Fixture extended: - `bad_same_hex_string` — `string(a, true) : string(b, true)` → fires - `good_different_hex_string` — `string(a, true) : string(b, false)` → silent `expect 31208:11` → `expect 31208:12`. Also collapsed adjacent return-early guards in check_perf021_ternary_cast_hoist and the tail-args loop per STYLE016. Verified: dastest utils/lint/tests (29/29 pass), perf_lint.das self-lint clean, daslib sweep 0 residual PERF021 hits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
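The tail-argument problem flagged in the review above can be reproduced outside daslang. The sketch below uses C++ with a hypothetical to_text(value, hex) function standing in for daslang's string(value, hex); it is an analogy for the rule's gate, not the lint implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for string(value, hex); not a real library API.
inline std::string to_text(int value, bool hex = false) {
    if (!hex) return std::to_string(value);
    static const char digits[] = "0123456789abcdef";
    unsigned v = unsigned(value);
    std::string out;
    do {
        out.insert(out.begin(), digits[v & 15u]);
        v >>= 4;
    } while (v != 0);
    return "0x" + out;
}

// Tail arguments are equal on both branches: hoisting is safe.
inline std::string two_calls_same_tail(bool cond, int a, int b) {
    return cond ? to_text(a, true) : to_text(b, true);
}
inline std::string hoisted_same_tail(bool cond, int a, int b) {
    return to_text(cond ? a : b, true);
}

// Tail arguments differ: a hoist that keeps only the first argument
// silently drops hex=true from one branch.
inline std::string two_calls_mixed_tail(bool cond, int a, int b) {
    return cond ? to_text(a, true) : to_text(b, false);
}
inline std::string naive_hoist(bool cond, int a, int b) {
    return to_text(cond ? a : b);
}

inline void check_ternary_hoist() {
    assert(two_calls_same_tail(true, 255, 16) == hoisted_same_tail(true, 255, 16));
    assert(two_calls_same_tail(false, 255, 16) == hoisted_same_tail(false, 255, 16));
    assert(two_calls_mixed_tail(true, 255, 16) == "0xff");
    assert(naive_hoist(true, 255, 16) == "255");  // semantic change
}
```

This is why the rule should stay silent unless the arguments after the first compare equal on both branches.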
With more checks, we should be able to disable some of them project-wide. This commit introduces support for disabling and enabling checks from the command line.
It may be useful to add a pre-push hook with das-fmt and linter checks mirroring CI behaviour.
…erf021-ternary-cast-hoist lint: PERF021 — hoist common workhorse cast out of ternary
MemoryModel::allocate/free/reallocate at src/misc/memory_model.cpp:122/151/194-195 mask uint64 size with ~alignMask. alignMask was uint32_t, so ~alignMask zero-extends to 0x00000000FFFFFFF0 when ANDed with the uint64 size, silently dropping the high 32 bits for any allocation >= 4 GB. A 4 GB+15 request became 16 bytes; the function then took the shoe path with size=0, computed si = (0>>4)-1 = 0xFFFFFFFF, and dereferenced chunks[0xFFFFFFFF], a wild-address read that crashed the process.
Phase 1 widened the heap public API to uint64 but missed this field on both MemoryModel and LinearChunkAllocator. Fix is a one-word widening of each; the existing `(size + alignMask) & ~alignMask` lines pick up the wider type automatically, and the `DAS_VERIFYF(s <= UINT32_MAX)` policy guard in LinearChunkAllocator::allocate now fires correctly on >4 GB requests instead of seeing a silently-truncated size.
Tests, all gated on DASLANG_HUGE_HEAP_TESTS=1:
- tests-cpp/small/test_heap_64bit.cpp: new 4 GB-boundary test asserting bytesAllocated grows by >= 4 GB through PersistentHeapAllocator; existing 5 GB test moved to persistent_heap (default LinearHeapAllocator is uint32-bounded by design and now panics with a clear message).
- tests/long_array_table/test_huge_array_resize_index.das (5 GB array<uint8>)
- tests/long_array_table/test_huge_array_iterate.das (2.2 GB, four iteration shapes)
- tests/long_array_table/test_huge_array_push_emplace_clone.das (push past INT_MAX)
- tests/long_array_table/test_huge_array_index_offset.das (~4.4 GB array<int>; exercises uint64 stride*idx address math)
All four daslang probes carry `options persistent_heap = true` (required for >4 GB arrays) and an inline gate (`static_if (typeinfo sizeof(type<int?>) < 8)` + has_env_variable check) so they silent-skip in CI without the env var.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
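The truncation described above follows from C++ conversion rules and can be reproduced in a few lines. The sketch assumes a 32-bit unsigned int and an alignMask value of 15 (consistent with the 0xFFFFFFF0 mask quoted in the commit); the function names are hypothetical and not part of MemoryModel.

```cpp
#include <cassert>
#include <cstdint>

// Assumed mask value: 16-byte alignment, matching the 0x...FFFFFFF0 in the commit text.

// Before the fix: ~alignMask is computed in 32 bits (0xFFFFFFF0) and then
// zero-extended to 64 bits, so the AND clears the high half of the size.
inline uint64_t align_up_narrow(uint64_t size) {
    uint32_t alignMask = 15;
    return (size + alignMask) & ~alignMask;
}

// After the fix: the same expression with a 64-bit mask keeps the high bits.
inline uint64_t align_up_wide(uint64_t size) {
    uint64_t alignMask = 15;
    return (size + alignMask) & ~alignMask;
}

inline void check_align_mask() {
    const uint64_t four_gb = uint64_t(1) << 32;
    // Small sizes are unaffected, which is why the bug stayed hidden.
    assert(align_up_narrow(100) == 112);
    assert(align_up_wide(100) == 112);
    // A 4 GB + 15 request collapses to 16 bytes with the narrow mask.
    assert(align_up_narrow(four_gb + 15) == 16);
    assert(align_up_wide(four_gb + 15) == four_gb + 16);
}
```

The expression text is identical in both functions; only the declared type of alignMask changes, which matches the commit's description of a one-word fix.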
daslang: add pre-push hook
Adds a third benchmark target alongside m1_sql / m3_array / m3f_array_fold: m4_decs_fold runs the same chain shape through `_fold(from_decs_template(type<DecsCar>)...)`. Gives a tri-platform comparison (SQL vs array vs decs) under one chain spec. Shared scaffold in _common.das: `[decs_template(prefix="car_")] DecsCar` mirroring Car's 6 fields + `fixture_decs(n)` parallel to `fixture_array(n)`. Each benchmark file gains a `run_m4` + `[benchmark]` wrapper. Lambda quirks: explicit `$(c : Car)` annotations don't match the decs tuple element type, so m4 lanes use `_select(_.field)` macro form (auto-types via macro expansion). first_or_default_match's sentinel is a named-tuple literal matching the iter element shape. Skipped (Cat C — need new decs surface): indexed_lookup (eid-lookup), join_count (decs join design), zip_dot_product (decs zip surface). Tracked in `benchmarks/sql/M4_DECS_EXPANSION.md` with full first-sweep results matrix, Cat A/B split, suspect-0ns-list, and Wave 2-4 plans (Cat C surface adds / Slice 5+ splice arms / per-chain component-narrowing perf). Wave 1 results (100K, INTERP): - Cat A m4 beats SQL on most aggregate/filter shapes (1.5-9x), ~3-5x slower than m3f due to 6-component multi-iter for-loop overhead even when chain reads one field - Cat B m4 falls to eager bridge (~100-130 ns); becomes the regression guard for each plan_decs_unroll splice arm as Slice 5+ lands Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…-block Sweeps 197 b|>run blocks across 51 SQL benchmarks, inserting `b |> accept(<result_var>)` immediately after the inner let-binding. Uses the existing `[sideeffects]` helper `accept` from dastest/testing.das:172. **Why:** Documents intent (result must escape) and protects against future DCE if anyone modifies the chains. ast_dump verified the calls survive compilation: e.g. take_count m3f lowers to a full spliced invoke + accept(b, rows) + empty/failNow guard — the chain is genuinely running. **Finding:** This sweep was initially aimed at the 11 m3f=0 ns/op cases suspected of being DCE'd (select_count, take_count, take_count_filtered, take_sum_aggregate, reverse_take, skip_take, distinct_take, any_match, element_at_match, first_match, first_or_default_match). After the sweep, those cells still report 0 ns/op. Investigation: ast_dump shows the spliced loop body fully expanded and the accept call alive. The zeros are real — dastest reports total_time / n where n=100000, and a body cost ≤~100us divides to ≤1 ns/op and rounds to 0. For take(N)+to_array shapes the divisor is 100000 but only TAKE_N=1000 elements are processed, so the unit underreports actual per-element cost. Not a DCE artifact; matrix is honest. The accept guards stay regardless — cheap insurance for future bench changes, and the convention is uniform across all four lanes (m1/m3/m3f/m4). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
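The 0 ns/op explanation above is plain division arithmetic. The sketch below assumes the reported figure is total nanoseconds divided by the divisor with integer truncation; the 90 us total is a made-up illustrative value, and ns_per_op is a hypothetical helper, not a dastest function.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: integer truncation when dividing total time by the divisor.
inline uint64_t ns_per_op(uint64_t total_ns, uint64_t divisor) {
    return total_ns / divisor;
}

inline void check_ns_per_op() {
    const uint64_t total_ns = 90000;  // illustrative 90 us body, under the ~100 us threshold
    // Divided by n = 100000 the result truncates to 0 ns/op.
    assert(ns_per_op(total_ns, 100000) == 0);
    // Divided by the 1000 elements a take(1000) chain actually processes,
    // the same total corresponds to 90 ns per processed element.
    assert(ns_per_op(total_ns, 1000) == 90);
}
```

Under these assumptions a zero cell says the whole body finished in under about 100 us, not that the chain was eliminated.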
Extends plan_order_family at daslib/linq_fold.das:1230 to recognize `first` and `first_or_default` alongside the existing `take(N)` terminator on order_by / order_by_descending / order / order_descending chains. **Why:** Prior to this change, `_fold(arr._order_by(key).first())` cascaded to plan_loop_or_count which emitted the full O(N log N) sort + index lookup. The bench `benchmarks/sql/sort_first.das` showed m3f=722 ns barely improving on m3=713 ns. After the splice arm: m3f=42 ns (17× win, matches the m1 SQL baseline of 37 ns within 14%). m4_decs_fold also improves 802→121 ns. **Emission:** - order_by + first → `min_by(top, key)` directly (O(N) single pass). Matches `min_by_impl`'s panic-on-empty semantics → identical to `order(...).first()`. - order_by_descending + first → `max_by(top, key)`. - bare order + first → `min(top)` (or `max` for descending). - order_by + first_or_default(d) → `top_n_by(top, 1, key) |> first_or_default(d)` since no `min_by_or_default` helper exists. One extra 1-elem array allocation but cleanly handles the empty case. - where_ + order_by + first / first_or_default: mirrors the existing prefilter- buffer pattern, calling min_by / max_by / top_n_by(_, 1, _) on the filtered buf. **New helper:** `order_min_call_name(orderName, hasKey)` returns "min" / "max" / "min_by" / "max_by" based on direction + key presence. **Recognizer guard:** first/first_or_default must be terminal — `i != length(calls) - 1` returns null so any trailing op cascades to tier-2. **Tests:** - tests/linq/test_linq_fold.das: 7 new parity cases under `test_fold_order_by_first` (order_by + first, order_by_descending + first, where + order_by + first, order_by + first_or_default with empty/non-empty/filtered-empty sources). - tests/linq/test_linq_fold_ast.das: 4 new AST-shape gates confirming the splice emits min_by / max_by / top_n_by + first_or_default and DOES NOT emit order_by / first / first_or_default itself. 
Full sweep: 1182 linq tests + 376 fold tests + 188 AST tests, all green. MCP lint + CI lint clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…matrix update plan_order_family's order+first splice was emitting min_by directly. min_by returns an uninitialized ref on empty source — silently swallows the panic that eager first() guarantees. Fix per Copilot review on closed PR #2757: - No-where + array source: wrap in invoke($(src){ panic if empty; return min_by(src,key) }, top). Zero allocation, one branch — preserves the 17× sort_first win. - No-where + iterator source: emit top_n_by(_, 1, _) |> first() — bounded n=1 heap; first() panics on empty array. - where + order + first: insert `panic if empty(buf)` stmt before the buffer min_by. Two new regression tests assert that first() on empty array and on filtered-empty source both panic. M4_DECS_EXPANSION.md gains a section logging the splice arm + the sort_first 722→41 ns (17×) win. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
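The equivalence the splice relies on, and the empty-source guard this fix adds, can be sketched in C++. This is an analogy under stated assumptions: first_by_sort and first_by_min are hypothetical names, exceptions stand in for daslang's panic, and a std::vector stands in for the array source.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

struct Car { int price; int id; };  // illustrative element type

// Eager form: sort by key, then take the first element. O(N log N).
template <typename T, typename Key>
T first_by_sort(std::vector<T> v, Key key) {
    if (v.empty()) throw std::runtime_error("first() on empty sequence");
    std::stable_sort(v.begin(), v.end(),
                     [&](const T& a, const T& b) { return key(a) < key(b); });
    return v.front();
}

// Spliced form: one O(N) pass. The explicit empty check preserves the
// failure that the eager first() would have raised.
template <typename T, typename Key>
T first_by_min(const std::vector<T>& v, Key key) {
    if (v.empty()) throw std::runtime_error("first() on empty sequence");
    return *std::min_element(v.begin(), v.end(),
                             [&](const T& a, const T& b) { return key(a) < key(b); });
}

inline void check_first_splice() {
    auto by_price = [](const Car& c) { return c.price; };
    std::vector<Car> cars = {{30, 1}, {10, 2}, {10, 3}};
    // Both forms pick the first element with the smallest key.
    assert(first_by_sort(cars, by_price).id == 2);
    assert(first_by_min(cars, by_price).id == 2);
    bool threw = false;
    try { first_by_min(std::vector<Car>{}, by_price); }
    catch (const std::runtime_error&) { threw = true; }
    assert(threw);
}
```

Without the empty check, std::min_element on an empty range returns the end iterator and dereferencing it is undefined, which is the C++ counterpart of the uninitialized-ref problem described in the commit.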
Addresses Copilot review on PR #2760: the recognizer captured `orderKey` whenever the order call had ≥2 args, but `hasKey` was only true for `order_by` / `order_by_descending`. For `order(arr, cmp)` (and `order_descending(arr, cmp)`), splice emitted bare `min(arr)` / `max(arr)` / `top_n(arr, N)` — silently dropping the user-supplied comparator. Same bug pre-existed for the `take` arm. Both are fixed by a single bail: when the order call is `order` / `order_descending` AND argCount >= 2, return null from the recognizer. Chain falls through to `fold_linq_default`, which rewrites to `order_to_array(arr, cmp) |> first()` — semantics preserved. 3 functional regression tests (order/order_descending + first/take with cmp) and 1 AST gate (asserting no min/max/top_n splice + sort step survives) — all 3 functional tests failed before the fix (returned min instead of cmp-honoring result) and pass after. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ssion PR #2753's new PERF020 rule (`T(x)` where x is already T) caught three real sloppy-codegen sources that all emitted unnecessary workhorse casts the interpreter doesn't fold: 1. **daslib/linq.das average()** (iter + array overloads, lines 1529 / 1541) — `total += double(x)` fires when caller pre-casts the projection to double (e.g. `_select(double(_.price)).average()`). Wrap in static_if guarded by `typeinfo stripped_typename(x) == typeinfo stripped_typename(default<double>)` so the cast only emits when needed. 2. **linq_fold.das average splice** (plan_loop_or_count, line 734) — emitted `double(accName) / double(cntName)` unconditionally. accName carries accType, which for double-projected chains is already double. Branch on `accType.baseType == Type.tDouble` to skip the cast. 3. **linq_fold.das count-shortcut emissions** (emit_length_shortcut line 432 and the plan_zip length-shortcut line 3362) — emitted `int(length(...))` for count. length already returns int, so the cast is dead weight. Split into `length(...)` for count and `int64(length(...))` for long_count. No semantic change. Closes the CI lint failure on PR #2760 (5 new + 2 of 4 pre-existing PERF020 warnings in changed files). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Lint add enable/disable
… compat)
mapfile is bash 4+ only. macOS ships bash 3.2.57 as /bin/bash (last GPLv2
version), so the hook fails on every Mac developer's first push with:
.githooks/pre-push: line 71: mapfile: command not found
Replace with a portable `while read; CHANGED+=("$line"); done < <(...)` —
same semantics, works on bash 3.2 and bash 4+.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…se8a-alignmask longarr phase 8a: alignMask uint32 truncation + huge-array probes
…ench-order-first linq_fold: m4_decs_fold bench lane + anti-DCE accept sweep + order_by+first splice arm
…ok-bash32 .githooks/pre-push: portable read loop (bash 3.2 compat for macOS)
Extends plan_zip's terminator dispatch with sum/min/max/average + first/first_or_default/any/all/contains, mirroring plan_loop_or_count's lane emission via generalized helpers. emit_accumulator_lane / emit_early_exit_lane now take multi-source via parallel arrays (srcNames + topExprs). For-loop emission branches on length: 1-source uses $i(itName); 2-source uses literal `itA, itB` because qmacro for-loop iter-var position doesn't accept $i() splice in the multi-iter form. New finalize_lane_emission helper handles the 1- vs 2-arg invoke wrap. finalize_invoke loops over all block args to set can_shadow (was hardcoded to args[0]). plan_zip threads `let it = (itA, itB)` via preCondStmts so itName resolves to the tuple inside the loop body for where/projection/ predicate eval. Accumulator without projection bails to tier-2 (tuple has no += so sum/min/max/average wouldn't typecheck anyway). Tests: 16 new (14 behavioral parity covering sum/min/max/average, where+sum, where+long_count, first no-proj/proj, where+first, first_or_default, any no-pred/empty/pred, all true/false, contains hit/miss + 2 AST shape asserting single multi-iter for-loop + no surviving zip/sum/first calls). 220/220 ast + 369/369 fold interpret tests green; 1169/1169 AOT sweep across tests/linq. Existing single-source plan_loop_or_count behavior preserved (call sites wrap params in 1-element arrays at the boundary). Deferred: last/last_or_default/single/single_or_default/element_at/ element_at_or_default/aggregate on zip (TERMINAL_WALK lane); 3..8-ary zip splice (Z4/Z5); any-no-pred length shortcut on zip. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PR #2742 review #r3270242609: classify_terminator("long_count") returns ACCUMULATOR, so my new ACCUMULATOR dispatch fired first and bailed (projection == null), regressing `zip(...).long_count()` and `zip(...).where(p).long_count()` to tier-2 cascade instead of the existing COUNTER length-shortcut / counter loop. Fix: gate the ACCUMULATOR branch with `!isCounter`. long_count now flows through the existing COUNTER path uniformly (already handles both bare via length shortcut and chain via counter loop). Strengthened test_zip_long_count_uses_length_shortcut with a count_call("long_count") == 0 assertion — the previous count_inner_for_loops == 0 check was satisfied trivially by tier-2 passthrough (raw call chain has no for-loops in its immediate AST), so the regression slipped through. Added test_zip_where_long_count_emits_counter_loop to guard the chain case analogously (for-loops == 1, no surviving long_count call). Note: dropped a candidate `count_call("length") >= 2` assertion since count_call doesn't recurse into ExprOp3 ternary (where the length(srcA) < length(srcB) ? ... lives in the shortcut). The two assertions above (for_loops + long_count) discriminate the three paths — length shortcut / counter loop / tier-2 cascade — by elimination. PR #2742 review #r3270242659: updated the accumulator section comment to note long_count routes through COUNTER, not the projection-required ACCUMULATOR path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ix tracking test) PR #2742 review #r3270337491 — added explicit bounds guard at finalize_lane_emission entry: let nSrcs = length(srcNames) if (nSrcs != 1 && nSrcs != 2) panic("... only 1- or 2-source supported (got {nSrcs}) ...") if (length(topExprs) != nSrcs) panic("... length mismatch ...") Defensive: helper is private + called from 2 sites passing 1- or 2-source arrays per protocol, but the guard trips a clear error if future Z4/Z5 (3..8-ary zip) work routes through this without extending the branch first. PR #2742 review #r3270337476 — emit_accumulator_lane.average semantics divergence from linq.das. PRE-EXISTING in helper: accumulates in accType (often int → overflow risk) + returns NaN on empty cnt, while linq.das average accumulates in double + returns 0lf on empty. Affects single-source plan_loop_or_count too. Existing fold test "average: empty → NaN" locks in the current divergent behavior. Fix DEFERRED to follow-up PR (uniform fix across both planners + update of the single-source test). This PR adds a tracking test (`test_zip_average_empty_returns_zero_when_fixed`) that: - Has the target function (`target_zip_average_empty_fold`) wired up - Calls `t->skip(...)` with a clear deferral message at the top - Documents the desired post-fix assertion as a comment - Acts as a discoverable to-do for the follow-up PR (un-skip + assert) Per Boris's reinforcement (PR #2742): deferring is fine, adding disabled tests is fine, but not adding tests at all for the bugs we found — NOT fine. Verification: lint clean, 222/222 interpret (221 passed + 1 skipped), 1171/1171 AOT (1170 passed + 1 skipped). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
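The empty-source divergence described above can be shown with two small functions. This sketch models only the empty case (NaN versus zero), assuming IEEE 754 doubles; the overflow half of the divergence is not modeled, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Fold-style sketch: integer accumulator, unconditional division at the end.
// On an empty source this evaluates 0.0 / 0.0, which is NaN under IEEE 754.
inline double average_fold_style(const std::vector<int>& xs) {
    int64_t sum = 0;
    int64_t cnt = 0;
    for (int x : xs) { sum += x; ++cnt; }
    return double(sum) / double(cnt);
}

// linq-style sketch: double accumulator, explicit zero on an empty source.
inline double average_linq_style(const std::vector<int>& xs) {
    double sum = 0.0;
    int64_t cnt = 0;
    for (int x : xs) { sum += double(x); ++cnt; }
    return cnt == 0 ? 0.0 : sum / double(cnt);
}

inline void check_average_divergence() {
    const std::vector<int> empty;
    const std::vector<int> some = {1, 2, 3};
    assert(std::isnan(average_fold_style(empty)));
    assert(average_linq_style(empty) == 0.0);
    // Non-empty sources agree; only the empty case diverges here.
    assert(average_fold_style(some) == 2.0);
    assert(average_linq_style(some) == 2.0);
}
```

A tracking test for the follow-up would assert the second behaviour for both planners once the fix lands.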
Spike for the next big chunk of the 64-bit array/table widening project (PR-D, linq surface). Question: can a function accept `int | int64` as a single signature and fork inside the body with static_if, so the ~10 linq functions targeted by PR-D widen with one signature each instead of doubled overloads? Answer: yes -- `def take_or(x : int | int64)` already parses (the disjunction-parameter shape was used in tests/language/option_type.das for ref/auto resolution). What was missing was a clean dispatch predicate; `stripped_typename(x) == "int"` is a string-compare hack for something this prominent. Adds two `typeinfo` traits in src/ast/ast_infer_type.cpp next to `is_numeric`, following the `is_string` pattern (baseType match + `dim.size() == 0`): typeinfo is_int(x) -> baseType == tInt && dim.size() == 0 typeinfo is_int64(x) -> baseType == tInt64 && dim.size() == 0 tests/long_array_table/test_int_int64_disjunction.das pins both halves (disjunction-parameter dispatch + the two new traits) with static_assert type-contract probes so silent reverts on either side flip the test red. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
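For readers more familiar with C++, the dispatch shape described above resembles a constrained template that forks with if constexpr. The sketch is an analogy only: is_int_v / is_int64_v mimic the two new typeinfo traits, and index_width_bits is a hypothetical function, not the take_or from the spike.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Analogues of typeinfo is_int(x) / typeinfo is_int64(x); exact-type matches only.
template <typename T>
inline constexpr bool is_int_v = std::is_same_v<T, int32_t>;
template <typename T>
inline constexpr bool is_int64_v = std::is_same_v<T, int64_t>;

// One signature covering both widths, forking inside the body at compile time.
// Hypothetical example function; it just reports which branch was taken.
template <typename T>
constexpr int index_width_bits(T) {
    static_assert(is_int_v<T> || is_int64_v<T>, "accepts int32_t or int64_t only");
    if constexpr (is_int_v<T>) {
        return 32;  // narrow path
    } else {
        return 64;  // wide path
    }
}

// Compile-time probes, in the spirit of the static_assert type-contract tests.
static_assert(is_int_v<int32_t> && !is_int_v<int64_t>);
static_assert(is_int64_v<int64_t> && !is_int64_v<int32_t>);

inline void check_disjunction_dispatch() {
    assert(index_width_bits(int32_t(1)) == 32);
    assert(index_width_bits(int64_t(1)) == 64);
}
```

The design point is the same as in the spike: a type-level predicate makes the fork explicit and checkable, where a string comparison on a type name would not.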
Records the 8.3× win (m3f 58→7 ns/op) after the cherry-picked plan_zip accumulator + early-exit lane work fires on the zip(xs,ys)._select(_._0 * _._1).sum() chain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…se8b-int-int64-disjunction phase 8b: typeinfo is_int / is_int64 + int|int64 disjunction spike
…_array Extends Approach Z direct-inline splice (PR #2750) to cover the remaining terminator surface for from_decs* chains. Slice 3a (accumulator family): min/max/average added to emit_decs_accumulator. Match non-decs emit_accumulator_lane semantics — min/max keep a `first` flag hoisted above outer for_each_archetype; average keeps a running sum + count and divides via double() at end. sum/min/max/average require a scalar _select. Slice 3b (early-exit): new emit_decs_early_exit for first/first_or_default/ any/all/contains. Outer becomes for_each_archetype_find (returns bool; inner block returns true to stop the archetype walk). any/all/contains use the find's return value directly (all negates). first/first_or_default thread a found flag + result via prelude/tail. Slice 3c (to_array): new emit_decs_to_array hoists `var buf` above outer for_each_archetype and per-element push_clones the projection (or named tuple when no _select). Dispatched via the implicit "no recognized terminator" path since linqCalls marks to_array as skip=true. Refactor: build_decs_tup_bind + build_decs_inner_for helpers extracted from Slice 2's emit_decs_accumulator so the new emitters share the for-body shape. DecsBridgeShape gains elementType (cloned from resVar._type.firstType) for to_array / first / first_or_default when no projection is present. Tests: 14 new functional parity + 3 AST-shape gate tests in tests/linq/test_linq_from_decs.das. All 29 file-local tests green; 1146 linq + 234 decs interp tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
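The early-exit shape from Slice 3b can be sketched as a two-level loop in which the outer walker stops as soon as the inner block returns true. The C++ below is a simplified model: an archetype is reduced to a vector of ints, and every name apart from for_each_archetype_find is hypothetical.

```cpp
#include <cassert>
#include <vector>

using Archetype = std::vector<int>;  // simplified stand-in for a decs archetype

// Outer walk: returns true as soon as the block asks to stop.
template <typename Block>
bool for_each_archetype_find(const std::vector<Archetype>& archetypes, Block block) {
    for (const Archetype& a : archetypes) {
        if (block(a)) return true;
    }
    return false;
}

// any: the find's return value is the answer.
template <typename Pred>
bool any_match(const std::vector<Archetype>& archetypes, Pred pred) {
    return for_each_archetype_find(archetypes, [&](const Archetype& a) {
        for (int x : a) if (pred(x)) return true;
        return false;
    });
}

// all: search for a counterexample and negate the result.
template <typename Pred>
bool all_match(const std::vector<Archetype>& archetypes, Pred pred) {
    return !for_each_archetype_find(archetypes, [&](const Archetype& a) {
        for (int x : a) if (!pred(x)) return true;
        return false;
    });
}

// contains: any with an equality predicate.
inline bool contains_value(const std::vector<Archetype>& archetypes, int value) {
    return any_match(archetypes, [value](int x) { return x == value; });
}

inline void check_early_exit() {
    const std::vector<Archetype> world = {{1, 2}, {}, {3, 4}};
    assert(any_match(world, [](int x) { return x > 3; }));
    assert(!any_match(world, [](int x) { return x > 4; }));
    assert(all_match(world, [](int x) { return x > 0; }));
    assert(!all_match(world, [](int x) { return x < 4; }));
    assert(contains_value(world, 3));
    assert(!contains_value(world, 5));
}
```

first / first_or_default would additionally carry a found flag and a result slot out of the block, which this sketch leaves out.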
CI's das-lint catches this where the MCP lint doesn't (different rule set). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1. emit_decs_accumulator (average): correct the empty-source comment. Both numerator and denominator are cast to double before division — empty → 0.0/0.0 → IEEE NaN. Never an int-division panic. 2. plan_decs_unroll (implicit to_array): gate the fallthrough on `expr._type.isGoodArrayType`. Without the gate, decs-bridge chains that end in iterator output — `_fold(from_decs_template(...))`, `_fold(...)._where(...)` — silently materialized into array<T> instead of preserving the iterator the user expected. Iterator-typed chains now return null and cascade to tier-2 fold_linq_default. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rminators Closes the deferred shape coverage from PR #2751: chained `_select` chains, `_where` after `_select`, `_count(pred)` / `_long_count(pred)`, and the `_min_by` / `_max_by` retention terminators on `from_decs_template`. Architecture shift: emit_decs_accumulator / early_exit / to_array now take a shared `DecsChainInfo` (built by `compute_decs_chain_info`) and wrap their per-element action via `wrap_decs_chain` instead of a singular projection + whereCond pair. Each `_select` introduces a fresh `decs_sel{N}` bind whose type carries forward; the reverse-walk wrapper emits `let bindN = proj` for selects and `if (pred) ...` for wheres in chain order, so after-select predicates see the projection output. `_count(pred)` / `_long_count(pred)` ride the same path: the accumulator emitter detects a 2-arg terminator call, peels its predicate against `finalBind`, and wraps the counter increment with `if (pred) ...`. New `emit_decs_min_max_by` mirrors the min/max accumulator but stores both key + element (workhorse key via `<` / non-workhorse via `_::less`). To keep `long_count(pred)` actually callable, linq.das gains `long_count(iter; pred)` + `long_count(arr; pred)` overloads matching the existing `count(iter; pred)` / `count(arr; pred)` shape, and linq_boost.das gains `_long_count` shorthand alongside `_count`. Broader 64-bit sweep (take/skip/element_at/top_n N-parameter) is gated on 64-bit arrays + tables and noted at the tail of benchmarks/sql/LINQ.md. tests/linq/test_linq_from_decs.das: 11 new tests (chained select sum, select→where sum, where→select→where sum, _count(pred), _long_count(pred), chained select to_array, _min_by, _max_by, plus three AST-shape gates). Full sweep: 1171 linq tests + 239 decs tests green; lint clean (MCP + CI). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lan-zip-accum-early-exit-v2 linq_fold: plan_zip accumulator + early-exit terminators (resurrect orphaned #2742)
fix: add null check for subexpression type in ExprAt handling
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)